2024-07-02
Introduction to the Grammar of Graphics
We will be looking at the different layers that make up a ggplot2 plot
The focus of this talk will be on the 20% of ggplot2 that is useful 80% of the time
My goal is to make you excited about ggplot2!
I will take questions at the end
This workshop focuses on data visualization with ggplot2.
ggplot2 is a system for declaratively creating graphics, based on The Grammar of Graphics. You provide the data, tell ggplot2 how to map variables to aesthetics, what graphical primitives to use (geoms), and it takes care of the details.
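A minimal sketch of that idea, assuming the mpg dataset that ships with ggplot2 (the choice of variables is illustrative only): supply the data, map two variables to the x and y aesthetics, and pick a geom.

```r
library(ggplot2)

# Data: mpg (bundled with ggplot2). Mapping: engine size -> x, highway mpg -> y.
# Geom: points. ggplot2 fills in default scales, coordinates, and theme.
p <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point()

p  # printing the object draws the plot
```

Everything not stated explicitly here is filled in with defaults, which the following layers let you override.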
First published in 1999
A theoretical deconstruction of data graphics
Foundation for many graphic applications
The Grammar of Graphics can be applied to every type of plot
Lets you concisely describe the components of a graphic
Your dataset
Tidy format
There is no visualization without a dataset
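As a rough illustration of what tidy format means in practice, the sketch below reshapes a small made-up table (the site names and values are invented purely for the example) from one column per year into one row per observation, using base R only. In real work a helper such as tidyr::pivot_longer() is a common choice for this step.

```r
library(ggplot2)

# Hypothetical wide table: one column per year (values are made up).
wide <- data.frame(
  site  = c("A", "B"),
  y2022 = c(10, 20),
  y2023 = c(12, 18)
)

# Tidy (long) version: one row per site-year observation, one column per variable.
long <- data.frame(
  site  = rep(wide$site, times = 2),
  year  = rep(c("2022", "2023"), each = nrow(wide)),
  value = c(wide$y2022, wide$y2023)
)

# Each variable can now be mapped to its own aesthetic.
p <- ggplot(long, aes(x = year, y = value, fill = site)) +
  geom_col(position = "dodge")
```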
Aesthetic mapping: links variables in the data to graphical properties in the geometry.
We can specify properties such as colour, shape, alpha, fill, and size within the aesthetic mapping.
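A short sketch of the difference between mapping and setting, again assuming the mpg dataset: a property placed inside aes() varies with a variable in the data, while one placed outside aes() is a constant applied to every point.

```r
library(ggplot2)

# Mapped: colour and shape follow variables in the data, so they go inside aes().
mapped <- ggplot(mpg, aes(x = displ, y = hwy, colour = class, shape = drv)) +
  geom_point(alpha = 0.6, size = 2)  # alpha and size are set, not mapped

# Set: a single fixed colour goes outside aes() and produces no legend.
fixed <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(colour = "steelblue")
```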
Statistics transform input variables to displayed values:
Bins for a histogram, using stat_bin()
Summary statistics for a boxplot, using stat_boxplot()
Number of observations in each category for a bar chart, using stat_count()
Even tidy data may need some transformation
Each statistic is linked to a geometry: every geom has a default stat
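The link between geoms and stats can be sketched as follows, assuming the mpg dataset. Each geom below relies on its default stat, and layer_data() is used to peek at the values the stat computed.

```r
library(ggplot2)

# geom_bar() uses stat_count() by default: it counts the rows in each class.
bars <- ggplot(mpg, aes(x = class)) +
  geom_bar()

# geom_histogram() uses stat_bin(): it bins a continuous variable and counts per bin.
histogram <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(bins = 20)

# geom_boxplot() uses stat_boxplot(): it computes the five-number summary per group.
boxes <- ggplot(mpg, aes(x = drv, y = hwy)) +
  geom_boxplot()

# Inspect what the stat produced for the bar chart (one row per class).
counts <- layer_data(bars)
counts[, c("x", "count")]
```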
Scales help us to control the mapping from data to aesthetics
Scales also provide the tools that let you interpret the plot: the axes and legends.
Scales are automatically generated in ggplot2 and can be customized
Example: a log scale
We can also specify limits within the scale
Scales help you interpret the plot
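A possible way to customize the default scales, assuming the mpg dataset. The limits, palette, and legend titles are arbitrary choices made for illustration.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, y = hwy, colour = drv)) +
  geom_point() +
  # Log-transform the x axis (the data are transformed before plotting).
  scale_x_log10() +
  # Set the limits and the axis title of the y scale.
  scale_y_continuous(limits = c(10, 50), name = "Highway mpg") +
  # Choose the colour palette and the legend title.
  scale_colour_brewer(palette = "Dark2", name = "Drive train")
```

Each aesthetic gets its own scale function, named scale_<aesthetic>_<type>().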
Geometries determine how the aesthetics are drawn as graphical objects
They determine your plot type
geom_bar()
geom_point()
geom_boxplot()
geom_histogram()
Divide your data into panels using one or two groups
Faceting allows you to look at smaller subsets of the data
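The two ideas above can be sketched together, assuming the mpg dataset: the same data and mapping drawn with different geoms, and the same plot split into panels by one variable with facet_wrap() or by two with facet_grid().

```r
library(ggplot2)

# Same data and mapping, different geoms: the geom decides the plot type.
base   <- ggplot(mpg, aes(x = drv, y = hwy))
points <- base + geom_point()
boxes  <- base + geom_boxplot()

scatter <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# One grouping variable: panels wrap into a grid, one per class.
by_class <- scatter + facet_wrap(~ class)

# Two grouping variables: rows by drive train, columns by cylinder count.
by_drv_cyl <- scatter + facet_grid(drv ~ cyl)
```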
A coordinate system maps the position of objects onto the plane of the plot.
It is also the physical mapping of the aesthetics to the paper
Coordinate systems affect all position variables simultaneously and differ from scales in that they also change the appearance of the geometric objects.
Coordinate systems control how the axes and grid lines are drawn.
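A brief sketch of swapping coordinate systems, assuming the mpg dataset. The last example illustrates one way coordinates differ from scales: coord_cartesian() zooms the view without discarding data, whereas scale limits replace out-of-range values with NA by default.

```r
library(ggplot2)

bars <- ggplot(mpg, aes(x = class)) +
  geom_bar()

# Same bars, different coordinate systems: the shapes themselves change.
flipped <- bars + coord_flip()   # x and y swapped, bars become horizontal
polar   <- bars + coord_polar()  # bars become wedges in polar coordinates

# Zoom in on part of the x range; the fitted line still uses all the data.
zoomed <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x) +
  coord_cartesian(xlim = c(2, 4))
```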
The theme controls the overall look of the plot
It spans every part of the graphic that is not linked to the data
Themes give you control over things like fonts, ticks, panel strips, and backgrounds
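A minimal sketch of theming, assuming the mpg dataset. The title text and the specific colours are arbitrary; the point is that theme settings change only the non-data parts of the plot.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  facet_wrap(~ drv) +
  labs(title = "Engine size and fuel economy")

styled <- p +
  # Start from a complete built-in theme and set the base font size.
  theme_minimal(base_size = 12) +
  # Then adjust individual elements: fonts, ticks, panel strips, backgrounds.
  theme(
    plot.title       = element_text(face = "bold"),
    axis.ticks       = element_line(colour = "grey50"),
    strip.background = element_rect(fill = "grey90", colour = NA),
    panel.background = element_rect(fill = "white", colour = NA)
  )
```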